Authorship Attribution of Texts: A Review
Internal identifier: 000393 (Main/Exploration); previous: 000392; next: 000394
Authors: B. Malyutov
Source:
- Lecture Notes in Computer Science [ 0302-9743 ] ; 2006.
Abstract
Abstract: We survey the authorship attribution of documents given some prior stylistic characteristics of the author’s writing extracted from a corpus of known works, e.g., authentication of disputed documents or literary works. Although the pioneering paper based on word length histograms appeared at the very end of the nineteenth century, the resolution power of this and other stylometry approaches is yet to be studied both theoretically and on case studies such that additional information can assist finding the correct attribution. We survey several theoretical approaches including ones approximating the apparently nearly optimal one based on Kolmogorov conditional complexity and some case studies: attributing Shakespeare canon and newly discovered works as well as allegedly M. Twain’s newly-discovered works, Federalist papers binary (Madison vs. Hamilton) discrimination using Naive Bayes and other classifiers, and steganography presence testing. The latter topic is complemented by a sketch of an anagrams ambiguity study based on the Shannon cryptography theory.
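The abstract mentions approaches that approximate attribution by Kolmogorov conditional complexity. As a rough illustration only, the idea can be sketched by substituting a general-purpose compressor for the uncomputable complexity: the extra compressed bytes a query text costs after an author's corpus serve as a proxy for its complexity given that corpus. The choice of zlib, the function names, and the toy corpora below are assumptions made for this sketch and are not taken from the reviewed paper.

```python
import zlib
from typing import Mapping


def compressed_size(text: str) -> int:
    """Length in bytes of the zlib-compressed UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def conditional_complexity(query: str, corpus: str) -> int:
    """Compression-based proxy for C(query | corpus): the extra bytes needed
    to encode the query once the compressor has already seen the corpus."""
    return compressed_size(corpus + query) - compressed_size(corpus)


def attribute(query: str, corpora: Mapping[str, str]) -> str:
    """Return the candidate whose corpus makes the query cheapest to encode."""
    return min(
        corpora,
        key=lambda author: conditional_complexity(query, corpora[author]),
    )


# Toy data invented for illustration; real studies use large training corpora.
corpora = {
    "author_a": "the quick brown fox jumps over the lazy dog " * 20,
    "author_b": "lorem ipsum dolor sit amet consectetur adipiscing elit " * 20,
}
query = "the quick brown fox jumps over the lazy dog"
print(attribute(query, corpora))
```

On this toy data the query repeats the first corpus, so it should be assigned to `author_a`. With real texts the differences are much smaller, and the result depends on the compressor and on the corpus sizes.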
Url:
DOI: 10.1007/11889342_20
Affiliations:
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000485
- to stream Istex, to step Curation: 000416
- to stream Istex, to step Checkpoint: 000361
- to stream Main, to step Merge: 000412
- to stream Main, to step Curation: 000410
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Authorship Attribution of Texts: A Review</title>
<author><name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:F438C14FA8A87CCBCE3F5DF947E3C1395655768B</idno>
<date when="2006" year="2006">2006</date>
<idno type="doi">10.1007/11889342_20</idno>
<idno type="url">https://api.istex.fr/document/F438C14FA8A87CCBCE3F5DF947E3C1395655768B/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000485</idno>
<idno type="wicri:Area/Istex/Curation">000416</idno>
<idno type="wicri:Area/Istex/Checkpoint">000361</idno>
<idno type="wicri:doubleKey">0302-9743:2006:Malyutov B:authorship:attribution:of</idno>
<idno type="wicri:Area/Main/Merge">000412</idno>
<idno type="wicri:Area/Main/Curation">000410</idno>
<idno type="wicri:Area/Main/Exploration">000393</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Authorship Attribution of Texts: A Review</title>
<author><name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2006</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
<idno type="eISSN">1611-3349</idno>
</series>
<idno type="istex">F438C14FA8A87CCBCE3F5DF947E3C1395655768B</idno>
<idno type="DOI">10.1007/11889342_20</idno>
<idno type="ChapterID">Chap20</idno>
<idno type="ChapterID">20</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We survey the authorship attribution of documents given some prior stylistic characteristics of the author’s writing extracted from a corpus of known works, e.g., authentication of disputed documents or literary works. Although the pioneering paper based on word length histograms appeared at the very end of the nineteenth century, the resolution power of this and other stylometry approaches is yet to be studied both theoretically and on case studies such that additional information can assist finding the correct attribution. We survey several theoretical approaches including ones approximating the apparently nearly optimal one based on Kolmogorov conditional complexity and some case studies: attributing Shakespeare canon and newly discovered works as well as allegedly M. Twain’s newly-discovered works, Federalist papers binary (Madison vs. Hamilton) discrimination using Naive Bayes and other classifiers, and steganography presence testing. The latter topic is complemented by a sketch of an anagrams ambiguity study based on the Shannon cryptography theory.</div>
</front>
</TEI>
<affiliations><list></list>
<tree><noCountry><name sortKey="Malyutov, B" sort="Malyutov, B" uniqKey="Malyutov B" first="B." last="Malyutov">B. Malyutov</name>
</noCountry>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Wicri/Musique/explor/MonteverdiV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 000393 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 000393 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Wicri/Musique |area= MonteverdiV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:F438C14FA8A87CCBCE3F5DF947E3C1395655768B |texte= Authorship Attribution of Texts: A Review }}
This area was generated with Dilib version V0.6.21.